SL2 Regression & Classification
1. Regression
1. Linear regression:
- In standardized units the fitted slope equals the correlation, so |slope| ≤ 1 → predictions lean toward the mean (regression to the mean)
- If f(x) = c (a constant model), the best c is the mean of y → proved by setting the derivative of the sum of squared errors to zero
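The mean-minimizes-squared-error claim above can be checked numerically. A minimal sketch with toy data of my own (not from the notes): scan candidate constants c and confirm the one with the lowest sum of squared errors coincides with the mean.

```python
import numpy as np

y = np.array([1.0, 2.0, 4.0, 7.0])  # toy data, illustrative only

# Sum of squared errors for a constant predictor c
def sse(c, y):
    return np.sum((y - c) ** 2)

# Brute-force search over a fine grid of candidate constants
candidates = np.linspace(0, 10, 10001)
best = candidates[np.argmin([sse(c, y) for c in candidates])]

print(best, y.mean())  # the grid minimizer lands on the mean, 3.5
```

Differentiating sse(c) = Σ(yᵢ − c)² gives −2Σ(yᵢ − c) = 0, i.e. c = (1/n)Σyᵢ, which is what the search recovers.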
2. Polynomial regression:
- The higher the degree, the lower the training error — but high-degree fits overfit.
- How to find the weight vector w? Build the design (Vandermonde) matrix X with columns 1, x, x², …, then solve the normal equations: w = (XᵀX)⁻¹Xᵀy.
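A minimal NumPy sketch of finding w via least squares (toy, noiseless quadratic data of my own, so the fit is exact). `np.linalg.lstsq` solves the same normal-equations problem in a numerically stable way:

```python
import numpy as np

# Toy data from a known quadratic: y = 2 + 3x - 0.5x^2 (noiseless)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x - 0.5 * x ** 2

# Vandermonde design matrix: columns are x^0, x^1, x^2
X = np.vander(x, N=3, increasing=True)

# Least-squares solution of Xw ≈ y (equivalent to w = (X^T X)^-1 X^T y)
w, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w)  # recovers [2.0, 3.0, -0.5]
```

In practice one uses `lstsq` (or `np.polyfit`) rather than explicitly inverting XᵀX, which is ill-conditioned for high degrees.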
3. Sources of regression error:
- Sensor error (faulty device)
- Malicious data
- Transcription error
- Unmodeled influences
4. What is cross-validation?
- Fundamental assumption of CV: data are IID (independent and identically distributed), so there is no inherent difference between training, test, and real-world data.
- Randomly partition the training data into k folds of equal size.
- Train the model on (k-1) of the folds, leaving one out.
- Validate the model's performance on the held-out fold.
- Repeat so that each fold serves as the validation fold once.
- Average the validation errors.
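The steps above can be sketched in plain NumPy. The data (noisy sine) and fold count are my own illustrative choices, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, size=x.size)  # toy data

def cv_error(x, y, degree, k=5):
    """Average validation MSE over k folds, following the steps above."""
    idx = rng.permutation(x.size)          # random partition into k folds
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = np.polyfit(x[train], y[train], degree)   # train on k-1 folds
        pred = np.polyval(w, x[val])                 # validate on held-out fold
        errors.append(np.mean((pred - y[val]) ** 2))
    return np.mean(errors)                           # average the errors

print(cv_error(x, y, degree=3))
```

Libraries such as scikit-learn provide this as `KFold` / `cross_val_score`, but the loop is the whole idea.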
5. What happens to the CV error as the polynomial degree increases?
- At degree 0 it is higher than the training error — the constant model encodes no knowledge of x, so it underfits.
- It then decreases as the degree increases.
- Past a certain point, the CV error starts to increase again, because the model begins overfitting.
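One half of that picture is easy to verify directly: training error can only shrink as degree grows, since every lower-degree polynomial is a special case of a higher-degree one. A small sketch with toy data of my own:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=x.size)  # toy data

# Training MSE for polynomial fits of increasing degree
train_err = []
for d in range(0, 9):
    w = np.polyfit(x, y, d)
    train_err.append(np.mean((np.polyval(w, x) - y) ** 2))

# Training error falls monotonically with degree — but that says nothing
# about error on held-out data, which is exactly why CV is needed.
print(train_err[0], train_err[8])
```

The degree-8 fit tracks the noise as well as the signal; its low training error is precisely the overfitting that shows up as rising CV error.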